MetaTISA: Metagenomic Translation Initiation Site Annotator for improving gene start prediction

Authors

  • Gang-Qing Hu
  • Jiangtao Guo
  • Yongchu Liu
  • Huaiqiu Zhu
Abstract

SUMMARY We propose MetaTISA, a tool that aims to improve the translation initiation site (TIS) prediction of current gene-finders for metagenomes. The method uses a two-step strategy: it first clusters metagenomic fragments into phylogenetic groups and then predicts TISs independently for each group in an unsupervised manner. Evaluated on experimentally verified TISs, MetaTISA greatly improves the TIS prediction accuracy of current gene-finders. AVAILABILITY The C++ source code is freely available under the GNU GPL license via http://mech.ctb.pku.edu.cn/MetaTISA/.
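
The two-step strategy described in the abstract can be illustrated with a toy sketch. This is not the MetaTISA algorithm or its C++ code: the function names `group_fragments` and `refine_starts` are hypothetical, GC-content binning is only a stand-in for phylogenetic clustering, and the start-codon usage count is only a stand-in for a real unsupervised TIS model.

```python
from collections import Counter, defaultdict


def gc_content(seq):
    """Fraction of G/C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def group_fragments(fragments, n_bins=4):
    """Step 1 stand-in: bin fragments by GC content.

    Assumption: GC binning only illustrates the grouping step; it is not
    the phylogenetic clustering the abstract refers to.
    """
    groups = defaultdict(list)
    for name, seq in fragments.items():
        bin_index = min(int(gc_content(seq) * n_bins), n_bins - 1)
        groups[bin_index].append(name)
    return dict(groups)


def refine_starts(candidates, max_iter=10):
    """Step 2 stand-in: unsupervised start selection within one group.

    `candidates` maps a gene id to a non-empty list of in-frame candidate
    start codons, with the gene-finder's initial pick first. Each round
    re-estimates codon usage from the current picks and then re-picks the
    most used codon per gene (ties go to the earlier candidate).
    """
    choice = {gene: 0 for gene in candidates}
    for _ in range(max_iter):
        usage = Counter(codons[choice[gene]] for gene, codons in candidates.items())
        updated = {
            gene: max(range(len(codons)), key=lambda i, c=codons: (usage[c[i]], -i))
            for gene, codons in candidates.items()
        }
        if updated == choice:  # converged: no pick changed this round
            break
        choice = updated
    return choice
```

In this sketch, `refine_starts` would be called once per group returned by `group_fragments`, so that each group's model is estimated only from its own genes, mirroring the "independently for each group" wording of the abstract.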

Similar resources

Gene and translation initiation site prediction in metagenomic sequences

MOTIVATION Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms mu...

Accuracy improvement for identifying translation initiation sites in microbial genomes

MOTIVATION At present the computational gene identification methods in microbial genomes have a high prediction accuracy of verified translation termination site (3' end), but a much lower accuracy of the translation initiation site (TIS, 5' end). The latter is important to the analysis and the understanding of the putative protein of a gene and the regulatory machinery of the translation. Impr...

Improving promoter prediction for the NNPP2.2 algorithm: a case study using Escherichia coli DNA sequences

MOTIVATION Although a great deal of research has been undertaken in the area of promoter prediction, prediction techniques are still not fully developed. Many algorithms tend to exhibit poor specificity, generating many false positives, or poor sensitivity. The neural network prediction program NNPP2.2 is one such example. RESULTS To improve the NNPP2.2 prediction technique, the distance betw...

GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-...

Deriving ribosomal binding site (RBS) statistical models from unannotated DNA sequences and the use of the RBS model for N-terminal prediction.

Accurate prediction of the position of translation initiation (N-terminal prediction) is a difficult problem. N-terminal prediction from DNA sequence alone is ambiguous if several candidate start sites are close to each other. Protein similarity search is usually unable to indicate the true start of a gene as it would require a strong protein sequence similarity at the N-terminal portion of a p...

Journal:
  • Bioinformatics

Volume 25, Issue 14

Pages  -

Publication date 2009